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Abstract 

Background: The information of electromyographic signals can be used by 
Myoelectric Control Systems (MCSs) to actuate prostheses. These devices allow the 
performing of movements that cannot be carried out by persons with amputated 
limbs. The state of the art in the development of MCSs is based on the use of individual 
principal component analysis (iPCA) as a stage of pre-processing of the classifiers. The 
iPCA pre-processing implies an optimization stage which has not yet been deeply 
explored. 

Methods: The present study considers two factors in the iPCA stage: namely A (the 
fitness function), and B (the search algorithm). The A factor comprises two levels, 
namely A] (the classification error) and Aj (the correlation factor). Otherwise, the B factor 
has four levels, specifically B] (the Sequential Forward Selection, SFS), Bj (the Sequential 
Floating Forward Selection, SFFS), B 3 (Artificial Bee Colony , ABC), and B 4 (Particle 
Swarm Optimization, PSO).This work evaluates the incidence of each one of the eight 
possible combinations between A and B factors over the classification error of the MCS. 

Results: A two factor ANOVA was performed on the computed classification errors 
and determined that: (1) the interactive effects over the classification error are not 
significative (Fo.013,72 = 4.0659 > = 0.09), (2) the levels of factor A have significative 
effects on the classification error (Fo.02,1,72 = 5.0162 < f/\ = 6.56), and (3) the levels of 
factor B over the classification error are not significative (Fo.01,3,72 = 4.0659 > fg = 0.08). 
Conclusions: Considering the classification performance we found a superiority of 
using the factor/^ in combination with any of the levels of factor B. With respect to the 
time performance the analysis suggests that the PSO algorithm is at least 14 percent 
better than its best competitor. The latter behavior has been observed for a particular 
configuration set of parameters in the search algorithms. Future works will investigate 
the effect of these parameters in the classification performance, such as length of the 
reduced size vector, number of particles and bees used during optimal search, the 
cognitive parameters in the PSO algorithm as well as the limit of cycles to improve a 
solution in the ABC algorithm. 

Keywords: Artificial bee colony (ABC), Electromyography, Individual principal 
component analysis (iPCA), Myoelectric control, Particle Swarm optimization (PSO), 
Prosthesis, Pattern recognition 
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Introduction 

The loss of capabilities to manipulate objects with the hands is considered a high impact 
disease due to the physical, psychological and financial sequels related [1,2]. A current 
technological aid for upper limb amputation and deficiency is represented by the use of a 
myoelectric hand prosthesis. This device overcomes the inability to grasp and manipulate 
objects [3] and in some cases to sense and explore the surrounding world [4]. A myoelec- 
tric prosthesis is composed of two main systems: (a) the prosthesis and (b) the Myoelectric 
Control System (MCS), which is the topic of this work. The electromyographic signals 
(EMG) are used as an input of the MCS, which classifies patterns of the EMG signals and 
operate actuators in the prosthesis. 

The conventional MCS classifies the myoelectric patterns in movement classes. The 
processing in the MCS is composed of three stages: feature extraction, dimensional- 
ity reduction and classification. Recently, Hargrove et al. [5] proposed a complementary 
architecture which improves the performance of the conventional MCS. This new archi- 
tecture mixes 3 components: (a) a high crosstalk level EMG acquisition system, (b) an 
individual Principal Component Analysis (iPCA) transformation stage and (c) the con- 
ventional myoelectric control system. The main disadvantage of the iPCA projection is 
the dimensionality increment of the electromyographic patterns, in which the pattern 
dimension N is incremented by a factor corresponding to the control systems number of 
classes C. 

To deal with the dimensionality increment it is possible to compute a reduced iPCA 
transformation that generates just the N\ most discriminative dimensions instead of the 
complete set CxN (with N\ < CxN). The reduced iPCA transformation is used in [5] and 
requires an optimization process that intends to find a subset of N\ elements from a given 
set of C xN channels, warranting the maximum discrimination information in the selected 
subset. The optimization is based on a search strategy and a fitness function, i.e. a function 
that measures the quantity of discriminative information in the selected subset of channels. 

The optimization in myoelectric control systems with iPCA tuning is explored by the 
authors in [6], where the analysis is focused on the effects of factors A and B over 
the performance of myoelectric control systems with iPCA tuning; with A representing 
the fitness function factor with two levels (A\ classification error and A2 correlation 
factor) and B representing the search algorithm factor with three levels (£1 sequential 
forward selection (SFS), B2 sequential floating forward selection, B3 artificial bee colony 
(ABC)). Our paper presents new results that complete the investigation in [6]. Specifically, 
we evaluate a new level for the B factor: B4 particle swarm optimization (PSO) for a total 
of eight treatment alternatives. 

The paper is organized as follows: Section 'Related works' discusses previous works in 
this research area. Section 'Background' describes the principal component analysis, the 
problem of channel optimization, the search algorithms and the fitness functions consid- 
ered. In Section 'Methods' we present the experimental methodology. In Section 'Results' 
the simulation results are given. In Section 'Discussion' the results are discussed, and 
finally the conclusions are presented in Section 'Conclusions'. 

Related works 

Different approaches for the control of myoelectric prostheses have been used. These 
can be grouped in two trends [7]: (a) coordinated control [8] and (b) sequential control 
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which is the main topic of the current study. In the latter, the degrees of freedom are actu- 
ated one at a time, in a sequence, as one carries out a multifunction task. These systems 
are based on pattern recognition architectures where the patterns are represented by the 
EMG signal characteristics (x) and the classes are represented by the movements of the 
prosthesis (y). 

Starting from one of the first works that made use of a multilayer perceptron (MLP) 
neural networks (NNs) [9-11], various classifiers such as linear discriminant analysis 
(LDA) [12-14], (neuro) fuzzy [15-20], Gaussian mixture models (GMMs) [21,22], hidden 
Markov models (HMMs) [18], and support vector machines (SVMs) [23-25] have been 
used. Some commonly investigated feature sets include time domain (TD) features: MAV, 
MAVS, ZC, SSC, WL [12,26], autoregressive (AR) coefficients [27], cepstral coefficients 
[28], the short- time Fourier transform (STFT), the wavelet transform (WT), the wavelet 
packet transform [13] (WPT), and concatenated TD and AR (TDAR) [21,29] features. 
These solutions have reached different classification errors, depending on the number of 
classes and other factors (see Table 1 for a summary of the different errors reported in last 
years). For instance, in [29] an error of 7.4% is reported in a problem with 7 movement 
classes and 8 electromyography channels. Recently, a new strategy has been proposed in 
which the temporal-spatial information, contained within muscle crosstalk, may implic- 
itly add class discriminatory information to the classification problem [21]. This proposal 
has been investigated by Hargrove [5] observing a significative reduction in the classifica- 
tion error. In a problem with 7 movement classes and 6 EMG channels the system yielded 
a classification error of 1.9%. 

The proposal in [5] mixes 3 components: (a) a high crosstalk level EMG acquisition 
system (i.e. an acquisition system with an special distribution of electrodes that permits 
that the signal generated by the M\ muscle can be detected by the electrode associated 
with the M2 muscle, where muscles M\ and M2 are near [7]), (b) an iPCA transforma- 
tion stage and (c) the conventional myoelectric control system. The main disadvantage of 
the iPCA projection is the dimensionality increment of the electromyographic patterns, 
in which the original pattern dimension N is incremented by a factor corresponding to 
the control systems number of classes C, i.e. it generates a new pattern with dimension 
C times N. To deal with the dimensionality increment it is possible to compute a reduced 
iPCA transformation that computes just the N\ most discriminative dimensions instead 
of the complete set C x N (with N\ < C x N). This solution requires an optimization 
search to find a subset of N\ elements from a given set of C x N channels, warranting 
the maximum discrimination information in the selected subset. Because of the time of 
response of these systems, the optimization process must be conducted during a configu- 
ration stage, before the classification tasks. This optimization uses a search algorithm and 
a fitness function as depicted in Figure 1. The results in [5] were computed using a combi- 
nation of SFS {sequential forward selection) algorithm and the classification error fitness 
function. Despite important reported advantages, there are also some disadvantages for 
that configuration: (1) The SFS algorithm presents the nesting effect, i.e., once a channel 
has been selected there is no possibility of discarding it [31]. (2) The SFS algorithm does 
not use random components [32]; note that this type of components can help by finding 
different solutions and potentially lead to better ones [33] in problems with more than 
one local optimum. (3) The use of the classification error as the fitness function requires 
a supervised procedure, in which the classes associated to each pattern are known, being 



Table 1 Summary of the classification parameters and errors reached in the last years 



Classifier 


Channels 


Classes 


Features 


Subjects 


Error 


Additional processing 


Reference 


Year 


MLP 


2 


4 


MAV, MAVS, ZC, SSC, WL. 


9 


9.25% 




[9] 


1993 


LDA, MLP 


5 


4 


WPT. Transient 


16 


6.25% 




[30] 


1998 


PCA, LDA 


4 


6 


WPT. Stationary 


11 


6.80% 




[13] 


2001 


LDA 


4 


4 


TD. Stationary 


12 


5% 


Majority vote 


[12] 


2003 


GMM 


4 


6 


AR6+RMS+TD. Stationary 


12 


3.10% 


Majority vote 
Suppress interclass data 


[21] 


2005 


LDA 


8 


7 


AR4+RMS. Stationary 


30 


7.40% 


Majority vote 
Suppress interclass data 


[29] 


2007 


SVM 


4 


5 


RMS +AR6. Stationary 


11 


3% 


Majority vote 
Suppress interclass data 


[23] 


2008 


LDA 


10 


11 


AR6 Stationary 


10 


6.50% 




[5] 


2009 


LDA 


10 


11 


AR6 Stationary 


10 


5% 


iPCA Transformation 


[5] 


2009 
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Figure 1 Optimization scheme used in the parameter configuration stage. 1/1/, is a PCA matrix 
transformation of size N x N for / = 1 , . . ., C. Note that each W c (for c = 1,2,.. .C) matrix has a N x N size, 
then the projected vectors has an incremental size of CN x M. 



computationally expensive, due to the need for evaluating the classifiers output at each 
iteration. 

In order to obtain a solution to the previous described disadvantages the authors pro- 
pose and evaluate in [6], a set of novel configurations for the iPCA tuning, based on two 
factors: (A) fitness function and (B) search algorithm. The A factor with two levels: {A\) 
classification error and (A2) correlation factor. The B factor with three levels: (£1) sequen- 
tial forward selection (SFS), (£2) sequential floating forward selection and (£3) artificial 
bee colony (ABC). In this context, this article evaluates a new level for the B factor: B4 
particle swarm optimization (PSO). The overall results suggest an advantage on the use of 
the PSO algorithm with respect to other studied algorithms, with regard to the running 
time during the training stage. 

Background 

Principal Components Analysis 

PCA is an orthogonal linear transformation used to convert a set of observations of 
possibly correlated variables into a set of values of uncorrelated variables called princi- 
pal components. Given M observations of an N dimensional random vector z, the PCA 
transformation is performed by firstly subtracting the mean of the vector from z [5], 
x — z — E{z}> computing the N x N covariance matrix C x = E{xx T } and, then, applying 
5 = Wx, where s is the vector of the main component and W is the matrix in which each 
column is an eigenvector of C x . Usually the M observations would typically be samples 
taken from any C possible classes. This is known as universal PCA (uPCA) or global 
PCA [34]. This property of ignoring class information allows us to argue that PCA is 
suboptimal for classification purposes [35]. 

A recent variation, called individual PCA (iPCA) [34,35], groups the M observations 
according to their class membership. Separate projection matrices W\, . . ., Wc with size 
N x N are found for each class (see Figure 1). This set of matrices can be interpreted as a 
unique CN x N size transformation matrix Wipca formed by the concatenation of rows 
of the separate projection matrices. The iPCA method effectively "tunes" the data prior 
to classification and has been shown to improve classification accuracies for some pattern 
recognition problems [36]. The main drawback of this method is the linear increment 
of the dimensionality of the patterns with the number of classes C. To overcome this 
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problem, a reduced iPCA transformation matrix Wr is defined in [5]. This solution uses 
a transformation matrix with the best N\ basis of the Wipca matrix (N\ < CN). 

Channel optimization 

The optimization task of finding a subset of N\ elements from a given set of CN chan- 
nels can be interpreted as a discrete optimization problem (integer elements from the 
selected subset). The optimization scheme in Figure 1 is used to carry out this task. In this 
scheme, a validation data set x v is projected with the Wipca matrix. The projected pat- 
tern s is a CN x M size vector. The vector O' is a subset of the CN dimensions in s and its 
components change at each iteration, following a specific strategy defined by the search 
algorithm. The fitness function evaluates the quality of an EMG pattern with regard to the 
discriminative information. The evaluated patterns by the fitness function are established 
by the rows of s which compose the O' vector. When the stop conditions are reached, the 
scheme provides the selected channel subset O as output. 

Search algorithms 

The search algorithms treated in this study are the following: (a) Sequential Forward 
Selection (SFS), (b) Sequential Floating Forward Selection (SFFS), (c) Particle Swarm 
Optimization (PSO), and (d) Artificial Bee Colony (ABC); then, the same will be described. 

Sequential Forward Selection (SFS) 

This is one of the first developed search methods in the literature related to feature selec- 
tion issue. The search procedure consists of the following steps: (a) compute the fitness 
function for each of the CN channels and then select the channel with the best value, and 
(b) form all possible two dimensional vectors that contain the winner from the previous 
step and compute the fitness for each of them, (c) select the vector with the best value, (d) 
continue the process until the N\ vector length has been completed. The main drawback 
of this algorithm is the nesting effect; that is, once a channel has been selected there is no 
possibility to discard it [32]. 

Sequential Floating Forward selection (SFFS) 

This is a suboptimal search algorithm proposed by Pudil et al. in 1994 [31] with the pur- 
pose of eliminating the nesting effects of the SFS algorithm. This algorithm begins the 
search with an initial subset of two channels. For each subsequent iteration, two tasks are 
necessary: (a) search the candidate channel which minimizes the fitness function and add 
it to the selected subset O', (b) verify if the fitness function can be optimized by replac- 
ing a channel from the selected subset O'. Consequently, the SFFS search is performed 
dynamically, incrementing and decrementing the selected channels in the subset O' until 
reach the target length N\. An efficient way to implement this algorithm is presented in 
[31,37]. 

Particle Swarm Optimization (PSO) 

This is a population based stochastic optimization technique developed by Eberhart and 
Kennedy in 1995 [38]. In this algorithm, it is assumed that there exists a swarm with S 
particles. Consider that the search space is N\ -dimensional, and the i-th particle of the 
swarm can be represented by a dimensional position vector X[ = (xa,Xi2, . . .,XiN\). The 
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velocity of the particle is denoted by v; = (v,i, v/2> . . ViN\). Additionally, consider 
the best individual position for the i particle is Pbestt = (pn>Pi2> • • ->PiNi) and also 
the first and second best position explored so far are gfbest and gsbest respectively, 
velocity of the particle and its position are updated according to (1) and (2). 

Vij(t + 1) = w * r(vij(t)) + c\ * n * ripbestij - + c 2 *r 2 * r(gfbestij - 

+ c 3 * r 3 * rigsbestij - 

Xijit + 1) = Xijit) + r(v,y(£ + 1)) (2) 

where, i is the particle index for i e 1, . . .,S; j is a specific dimension for j e 1, . . . t N\; 
w is the inertia weight; r\, r 2 and r 3 are random numbers uniformly distributed in the 
range (0, 1); c\ is a cognitive parameter; c 2 and c 3 are social parameters; r(.) is a rounding 
function used to adapt the original PSO for the discrete optimization problem [39,40]. 

The purpose of including the second best particle gsbest in the velocity update equation 
is to increase the diversity and slow convergence by addressing each particle to move 
toward the weighted sum of both the best and second best solution the swarm has gener- 
ated [41]. Considering a minimization problem the local best of each particle is updated 
according to (3). 

pbesti(t + 1) = I pb ™ m) UK eSt J?l\ ~ f ^ l) < (3) 
F \xi(t) i{f(pbest i (t))>f(x i y, 

where, /(.) is the fitness function. Finally, the first and second global best of the swarm 
are updated according to (4) and (5). 

gfbest(t + 1) = argm±n pepbest .f(p(t + 1)) (4) 

gsbestit + 1) = argminpepbest^fbesty+irfipit + 1)) (5) 

It is possible that the solution computed with the PSO algorithm contains repeated ele- 
ments (i. e. that the vector O' has repeated elements). This leads to EMG patterns with 
high level of redundancy and it does not benefit the localization of the global minimum 
on the considered fitness functions. To overcome this, a modification on the conven- 
tional PSO algorithm was added. Whenever the search process leads to a solution with 
repeated elements, then, that solution will be penalized with a high fitness value and with- 
out evaluating the fitness function. This will hinder the EMG patterns with high level of 
redundancy from being selected as the optimal solutions in the minimization problem. 

Artificial Bee Colony (ABC) 

This is an optimization technique proposed in 2005 by Karaboga [42]. In ABC, each 
solution to the optimization problem is called food source and is represented by a 
N\ -dimensional vector. The colony of artificial bees contains three groups of bees: 
employed bees, onlookers and scouts. The first half of the colony consists of employed 
artificial bees and the second half are the onlookers. For each food source there is only 
one employed bee. Therefore, the number of employed bees is equal to the number of 
food sources. An employed bee whose food source has been exhausted becomes a scout. 
The ABC algorithm can be implemented in two steps [43]. 



that 
that 
The 



(1) 
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• Initialization step: the algorithm generates a random initial solution set of length S. 
Each solution = 1, 2. . S) is a N\ -dimensional vector. The employed bees 
measures the nectar amount of each solution, return to the hive and share the nectar 
information with the onlooker bees. 

• Iterative step: the solution set is submitted to repetitive search cycles. During these 
cycles, the bees change its memory contents, searching the food source with the best 
fitness. Each bee is able to remember just one food source localization X{. Each cycle 
is computed in three phases: (a) employed phase, at this phase each employed looks 
for a new food source v; around its current position X{ using (6). If the nectar quantity 
of this new position improves the previous one, then the bee position is updated. 
Once each employed bee has finished this phase, a probability factor pi is computed 
(7). Using this factor indicates the nectar level at each X[ food source, (b) Onlooker 
phase, at this phase the onlooker bees use the probability factor pi of each employed 
bee and select a X{ food source. Afterwards, a new v; food source around the selected 
neighborhood localization is computed (6). Finally, if the fitness value of this new 
position improves the previous one, then the bee position is updated, (c) Scout phase, 
whenever the exploration of a X{ food source does not finish with a solution 
improvement, a counter increments the number of trials of that food source. If the 
value in this counter is greater than a threshold Cu m , this x\ food source is discarded 
and a new food source is randomly selected by a scout bee. 

Vij = Xij + (pij(Xij - x kj ) (6) 

where k e (1, 2, . . .S) and j e (1, 2, . . .N\) are randomly chosen indexes, with k ^ L <pij is a 
random number in the range [—1, 1] that controls the production of food source positions 
around X[. 



where /(.) represents the fitness function value and S represents the number of food 
sources. 

Fitness functions 

A fitness function is a particular type of function used to summarise, as a single figure of 
merit, how close a design solution is to achieving the setting aims. In this case, the aim 
is related with the maximum discrimination information in a subset of C x N channels 
obtained after the iPCA transformation. Two fitness functions have been used in this 
study: (a) Classification error, and (b) Correlation factor, which will be described here: 

Classification error 

This is the same factor that has been used to measure the performance of the myoelectric 
control system; therefore, it is the ideal function to compare the discriminative informa- 
tion of the EMG patterns. This factor is computed as the relation between number of 
incorrect decisions and the total number of decisions of the system. To compute this rela- 
tion, is necessary to know the predicted class vector y and, therefore, the classification 



/(*.') 



(7) 



Pi = 
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Figure 2 Steps of the pattern recognition based myoelectric control system with iPCA tuning [5]. The 

dotted line shown in the figure is the conventional classification architecture for myoelectric control systems. 
The measured signal is represented by a generic vector* and the class label by a generic vector y. 



scheme composed of the block diagrams inside the dotted line in Figure 2 (i.e. the con- 
ventional myoelectric system) must be performed. Consequently, there is a significant 
computational complexity involved in classification error computation. 

Correlation factor 

Several factors have been proposed to quantify the amount of common signals presented 
between the two dimensions of a signal, the most common being the cross correlation 
factor [44] . The equation used to compute this factor for a discrete two dimension vector 
x is 

C x (xi,x 2 ) = E[(xi - fii)(x 2 - M2>] (8) 

where E is mathematical expectation and /x; = E[xj\. For TV-dimensional vectors, the 
correlation coefficient matrix is more used, where (9) defines each element of the matrix; 
each element varies within the range [1,-1]. 

Rx(P ' q) = Ir < \r ( T (9) 
y/C x (p,p)C x (q, q) 

for p = 1, . . N; q = 1, . . ., N. 

R x is an Afx Af matrix with correlation coefficients computed for each pair of dimensions 
of the vector x. This matrix is symmetric and has ones in its main diagonal; therefore, 
to quantify the correlation level it is enough to consider the lower or upper diagonal 
elements. An alternative considering the lower elements is 

f c = sum(tr±l(\R x \)) (10) 

where tril ( . ) is a function that returns the lower triangular elements of the matrix 
without the main diagonal ones. To limit the correlation factor between [0, 100] we have 
defined the expression in (11), which has been used for computing the correlation factor 
between dimensions of the EMG signal. 

Fc = 100 X N^N (U) 
Myoelectric Control Systems with iPCA tuning 

Figure 2 depicts the myoelectric control scheme with iPCA tuning [5], which uses an 
iPCA transformation for tuning input patterns. This transformation is a variation of 
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the PCA {Principal Component Analysis) that was initially used for improving the clas- 
sification performance in problems of face recognition [34]. The purpose of the iPCA 
transformation is to generate a new signal space where the discriminative information 
related to the movement class is amplified, as long as other types of information are atten- 
uated. However, there is a drawback: the patterns' dimensions increase by a factor of C. 
Considering an input pattern corresponding to N EMG channels and a classification sys- 
tem with C classes, one has to use of a transformation matrix WiPCA with size CN x N. 
Therefore, the projection of the input signal results in a new dimension pattern CN 
(see Figure 1) [5]. 

A reduced iPCA transformation can be defined for reducing the effects of dimension 
pattern increments. In this solution, it is necessary to compute a reduced iPCA transfor- 
mation matrix Wr. This matrix projects the input patterns and generates just the N\ most 
discriminative dimensions at the output (with N\ < CN). To compute Wr it is necessary 
to solve the block diagram in Figure 1. The output of this scheme is the vector O, which 
contains N\ channels selected from the CN set. Using this vector, the relation between 
the iPCA transformation matrix and the reduced iPCA transformation matrix is defined, 
as stated in (12), where the Matlab [45] notation for sub-matrices is used. 

W R = W iP cA(0,:) (12) 
Methods 

In order to evaluate the classification performance for each fitness-function/search- 
algorithm combination, a methodology based in three steps has been applied (namely, 
EMG signal acquisition, computing the reduced iPCA matrix, and system evaluation) 
which are depicted in Figure 3. 

EMG signal acquisition 

We used data collected by the University of New Brunswick - Canada, with authoriza- 
tion of Dr. Levis Hargrove. These data were acquired in an experiment approved by 
the University of New Brunswick's Research Ethics Board [5]. The data were sampled 
from ten healthy subjects performing eleven motion classes. EMG signals were acquired 
from ten sites on the forearm using adhesive duotrodes manufactured by 3M. These sig- 
nals were amplified to guarantee potentials in range [+5, —5] V and a bandwidth of 1Hz 
to 500Hz. Afterwards, the signals were sampled at lKHz and quantized with a 16-bit 
resolution. 

Experimental data were collected during eight trials. Each trial consists of two repeti- 
tions of eleven motion classes performed in sequential order, namely: (1) wrist pronation, 
(2) wrist supination, (3) wrist flexion, (4) wrist extension, (5) hand open, (6) key grip, (7) 
chuck grip, (8) power grip, (9) fine pinch grip, (10) tool grip, and (11) rest position or no 
movement. 

The intensity of the contraction was determined by the subject, but they were encour- 
aged to contract to a level that they comfortably repeated throughout the experiment. 
During all trials, subjects elicited the contraction from the rest position, held the con- 
traction for 4 s and then returned to the rest position for a predefined inter-motion class 
delay period. Trials 1, 2, 3 and 4 used intermotion class delay periods of 3, 2, 1 and 0 s 
respectively. Trials 5-8 used inter-motion class delay periods of 2 s. The result is a data 
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Figure 3 Sequence used to evaluate the classification error. Note that the blocks Computing iPCA Matrix 
and System Evaluation must run eight times, i.e. one iteration for each of the four search algorithms with the 
first fitness function (classification error) and one iteration for each of the four search algorithms with the 
second fitness function (correlation factor). 



set composed of myoelectric signals x in each channel and a vector of target classes asso- 
ciated y. A sample of the collected data excluding the intermotion class delays is depicted 
in Figure 4. 

In the present study, the EMG data recorded from trials 1 and 3 were used as a train- 
ing data set. EMG data recorded from trials 2 and 4 were used as a test data set. And 
finally, EMG data recorded from trials 5 and 6 were used as a validation set to resolve the 
optimization problem defined in the Background Section. Otherwise, for all these sets, 
the inter-motion class delays were excluded such as reported in [5] . 

Computing the reduced iPCA matrix 

This step has been implemented in two sequential tasks depicted in Figure 5: (a) search 
of the reduced dimension vector O, and (b) building of the reduced iPCA transformation 
matrix Wr. The first task began with the computing of the iPCA transformation matrices 
using the training data x tr > Afterwards, the validation data x v were projected with iPCA 
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0 10 20 30 40 50 60 70 80 



Figure 4 Acquired signal with the sequence of movements from 1 to 1 1 . Signals illustrated correspond 
to channel 3 during trials 6 to 8 and user 1 . Dotted lines indicate the held time for the contraction (4 s). Note 
that the movements were evaluated in sequential ascending order for all trials. 



transformation matrices and the transformed patterns s were used to run the optimal sub- 
set channel search. The result of this search is the vector O composed of the N\ channels 
with most discriminative information. This search was evaluated with the eight possible 
combinations of fitness-function/search-algorithm. Therefore, at the end of this task the 
subsets Oij were obtained, where the sub-indexes i, ; indicate the fitness function and the 
search algorithm, respectively; for i = 1, 2; j = 1, 2, 3, 4. 

In this case, for all search algorithms the N\ parameter was set to 30, such as 
recommended in [5]. This value provides good classification accuracy. 

The SFS algorithm was configured with just one stop criterion, namely: length of the 
selected subset equal to Ni. 

The SFFS algorithm used two stop criteria: (1) the same defined in SFS and (2) the 
maximum number of iterations (C max = 120). The latter was selected considering tests 
which show a mean number of iterations of 78.3 (with a standard deviation of 23.04) to 
find a solution. The PSO and ABC algorithms used other kind of parameters summed up 
in Table 2 and Table 3, respectively. For both fitness functions the optimization problem 
was defined as a minimization and the parameters were set with the next criteria: 



iPCA 
Transformation 



Optimization 



Computing of the 
- Transformation 



W- 



Row selection 



w„ 



Figure 5 Blocks for computing the reduced iPCA Transformation Matrix, (a) search of the reduced 
dimension vector 0 (b) building of the reduced iPCA transformation matrix Wr. Note that this process is 
conducted during a configuration stage, before the classification. 
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Table 2 PSO algorithm parameters 


Symbol 


Parameters 


Value 


6 


Minimum value of fitness function 


0 


Cl 


Cognitive parameter 


2 


Cl 


Social parameter 


2 


C3 


Social parameter 


1 


w 


Inertial weight 


[0.8,0.1] 


5 


Swarm size 


10 


Wmini v max] 


Velocity limits 


[-109, 109] 


[ x mini x max\ 


Position limits 


[1, 110] 


Cmax 


Maximum number of iterations 


120 



For the PSO algorithm the minimum value of cost function (e) was defined by the 
minimum possible value for the classification error and correlation factor (in this case 
equal to zero). The number of particles (S) was set to 10; for greater values, the search 
time was increased and the time for the experiments could exceed the available time 
(2 months to get experimental data), specially when the classification error were used as 
fitness function. For lower values, the search performance could be compromised due to 
the characteristics of population based stochastic optimization algorithms. The limit val- 
ues (v m in, v max , Xmi n , x max ) were defined by the dimension of the search space: C x N, 
where C is the number of movement classes (C = 10) and N is the number of acquisition 
channels (N = 11). The maximum number of iterations C max was set to 120; this value 
showed good results for both fitness functions. The algorithm used similar values for cog- 
nitive and social parameters (c\, <?2 and C3) allowing a neutral strategy, i.e. a strategy were 
the confidence in the experience of each particle is equal to the confidence in the expe- 
rience of the swarm. The inertial weight w decreases linearly allowing that the particles 
polish the search in the last iterations; this configuration induces an effect of global search 
in the first iterations and local search at the last ones. 

In the ABC algorithm the parameters were configured having two requirements in 
mind: (a) high performance in the optimal search, (b) assurance of conditions to make fair 
comparisons between bio-inspired algorithms. Then, the number of food sources (S) was 
configured to 10, equal than the number of particles in the PSO algorithm. The limit val- 
ues (v m int v m ax) were defined by the dimension of the search space: C x N. The number 
of maximum cycles (C max ) was set 120, equal that the maximum number of iterations in 
the PSO algorithm. The minimum value of the fitness function was set to 0, because this 
was the minimum value achievable by both fitness functions: classification error and cor- 
relation factor. The limit of cycles to improve a solution (Q/ m ) was configured to 6; this 
value showed good classifications performances and affordable times of search. 

To compute the classification error fitness function, the processing scheme formed 
by the blocks inside the dotted line in Figure 2, (i.e. the conventional myoelectric con- 
trol system) has been used. The configuration was the following: overriding windowing 

Table 3 ABC algorithm parameters 



Symbol Parameter Value 



6 


Minimum value of fitness function 


0 


5 


Number of food sources 


10 


Wmini v max\ 


Limit values 


[1,110] 


Cmax 


Number of maximum cycles 


120 


Qim 


Limit of cycles to improve a solution 


6 
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feature extraction [12] with lengths of 150 ms and sliding windows of 25 ms, features 
conformed by the first 6 autoregressive coefficients (AR6) [18], dimensionality reduc- 
tion method based on Uncorrelated LDA (ULDA) [29] and LDA {Linear Discriminant 
Analysis) classifier [46]. At this stage, each channel was independently used to train and, 
subsequently, to test the control system, such as presented in [5], 

At the second task (row selection block in Figure 5) we used equation (12) to com- 
pute the reduced iPCA transformation matrix corresponding to each pair fitness func- 
tion/search algorithm. Note that, at the output, eight reduced transformation matrices 
W\ were obtained as consequence of an optimization procedure computed eight times 
(the sub-indexes indicate the fitness function and the search algorithm, respectively). 

System evaluation 

The last proceeding of the methodology was the evaluation of the classification system. 
This was carried out in two phases as depicted in Figure 6: (a) the test data (xtst) clas- 
sification, and (b) the classification error computing. The first phase was achieved with 
the iPCA preprocessing and the conventional classification scheme (see Figure 2), using 
the same configuration described for the classification error fitness function computing. 
This process was computed for each of the reduced iPCA transformation matrices W\\ 
the output of this phase was the predicted class vector yij. The second phase compares the 
vector yij with the target vector y to compute the classification error as defined in equation 
(13). The classification error was individually evaluated for each user k of the system. 



The simulations proposed in this paper were performed on the Matlab platform and are 
based on the myoelectric control toolbox presented in [29]. Nevertheless, we have made 
some modifications and added some new functionalities to the myoelectric control tool- 
box presented in [29]. These modifications were necessary to appropriately parse the 
EMG data base that had a different structure than the used in [29] . The added functional- 
ities were done for configuring and implementing all the iPCA related functions, namely: 
optimal search, transformation matrices computing projections and performance indi- 
cators computing. The resulting simulator is a powerful tool for experimentation with 
iPCA myoelectric control, designed on a modular fashion that permits evaluation of dif- 
ferent approaches in each of the processing steps. Here we show results separated in two 
main stages: {a) first, the results registered during reduced iPCA transformation matrix 
computing, and secondly, (b) results registered during EMG pattern classification. 



eyik) = 100 x 



sumjy ^yjjjk)) 
lengthiy) 



(13) 



Results 



I Reduced iPCA Tuning 



X 



tst 




Conventional 
Classification 
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9- C 



Error computing 



v. 



Figure 6 Blocks for system evaluation. The dotted line shown in the figure contains the test data 
classification tasks. 
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SFS SFFS PSO ABC SFS SFFS PSO ABC 

Search algorithm Search algorithm 



Figure 7 Comparison of search time used for each of the eight treatment alternatives evaluated 
during optimization. The search time was computed as the product between the number-of-iterations and 
the mean-iteration-running-time. 



W R W R 




SFS SFFS PSO ABC SFS SFFS PSO ABC 

Search algorithm Search algorithm 



Figure 8 Comparison of the fitness value reached with each of the eight treatment alternatives 
during the optimization search. The sequence of algorithms in each figure was selected according to the 
descending order of fitness. 
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Table 4 Comparison of number-of-iterations/mean-iteration-running-time for the eight 
treatment alternatives during optimization 



Fitness function 


SFS 


SFFS 


PSO 


ABC 


Class, error 


30/1 282s 


64.9/1 705s 


120/1 55s 


120/475S 


Corr. factor 


30/5 .62s 


78.3/7.61 s 


120/0.79S 


120/1. 12s 



The results of the first stage of the system are shown in Figure 7 and Figure 8. Figure 7 
depicts the mean search time, i.e. the time necessary to compute the optimal vector O. 
The left figure shows the search time for classification error fitness function and the right 
figure for correlation factor fitness function. Complementary information on this subject 
is presented in Table 4. This table summarizes the mean values of the number of iter- 
ations and the time of each single iteration for each fitness-function/search-algorithm 
combination. Note that both bio-inspired algorithms were stopped by the maximum iter- 
ations criteria (120 iterations); this means that the time performance comparison among 
bio-inspired algorithms just depends on the time of each single iteration of the search 
algorithm. The SFS algorithm was stopped when reached 30 iterations and the SFFS have 
different mean iterations number for each fitness function. The times in Table 4 have 
been computed on a PC with 2.8GHz processor, 8Gb RAM memory and 4 cores. Figure 8 
depicts the mean fitness value reached during the optimal search. This value is presented 
for the eight evaluated combinations of search algorithm and the fitness function. 

The results of the next main stage (the EMG pattern classification stage) are shown 
in Figure 9. This figure displays the mean classification errors generated when each of 
the eight reduced matrices W\ were used in the myoelectric control system. The black 
bars indicate the classification errors obtained with the transformation matrix W% (i.e. 
the transformation matrices computed with the classification error fitness function) and 
white bars indicate the classification errors obtained with the transformation matrix W% 
(i.e. transformation matrices computed with the correlation factor fitness function). The 
vertical line on the bars represents one standard deviation of inter-subject error variabil- 
ity. The red line indicates the mean classification error obtained when the EMG patterns 
were classified without the iPCA transformation. 

A two factor analysis of variances (ANOVA) was performed over the error classifica- 
tion results, which has determined the following behaviors: (a) there was no statistical 
evidence of the dependence between the interaction of both factors (i.e. search algorithm 
and fitness function) and the classification error (Fn.01,3,72 = 4.0659 > Jab = 0.09), 
(b) there was statistical evidence of the dependence between the fitness function factor 
and the classification error (Fn.02,1,72 = 5.0162 < f A = 6.56), and (c) there was no sta- 
tistical evidence showing the dependence between the search algorithm factor and the 
classification error factor, (Fn.01,3,72 = 4.0659 > fs = 0.08). 

Discussion 

The analysis of simulation results may be performed taking into account two major 
aspects: (a) the computational complexity of the search process, and (b) the overall classi- 
fication accuracy, using a fitness function. The metric used to measure the computational 
complexity was the number of iterations and the computational time of each iteration 
(see Table 4). The product of these two metrics defines the search time (i.e. the time nec- 
essary to find an optimal solution) and gives an estimate of the computational complexity. 
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With regard to the number of iterations is important to mention that: (a) the bio-inspired 
algorithms used the maximum number of iterations (120) in each search and (b) the 
sequential algorithms never reached the maximum number of iterations (120). Therefore, 
the number of iterations is irrelevant to develop comparisons among bio-inspired 
algorithms; otherwise, the same is important to compare both sequential and bio-inspired 
algorithms. The search time in Figure 7 shows that: (a) solutions founded with error 
classification fitness function are more susceptible to long search times. The data have 
shown a relation of 250:1 between the search time of the classification error fitness 
function and the search time of the correlation factor fitness function. This was already 
expected due to the complexity associated with the supervised and non-supervised fitness 
functions (see fitness functions in Background Section); (b) the major and minor search 
time of the algorithms, regardless of the fitness function, are for SFFS and PSO respec- 
tively. The difference between these two algorithms is significant and indicates that the 
SFFS algorithm consumes almost four times more iterations than the PSO algorithm, for 
the case of maximum number of iterations = 120. The PSO algorithm is 14 percent better 
than the ABC algorithm, when correlation factor was used as fitness function. The differ- 
ence (with respect to SFS) reaches 31 percent when classification error is used as fitness 
function. 

This advantage of PSO over the other evaluated search algorithms is consequence of the 
low number of evaluations of the fitness function in each iteration of the PSO algorithm. 
The fitness values shown in Figure 8 indicate the following behaviors: (a) the fitness com- 
puted on the classification error fitness function was lower than that computed with 
correlation factor as fitness function; (b) the optimal solutions computed with sequential 
algorithms were closer to the global minimum than the ones computed with the bio- 
inspired algorithms. These results suggest a superiority of the solutions computed with 



Figure 9 Comparison of the classification error reached with each of the eight treatment alternatives. 

The error bar represents one standard deviation of intersubject variability. The red line indicates the mean 
classifcation error obtained when the EMG patterns were classified without the iPCA transformation. 
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sequential algorithms. In order to generalize that behavior, it is necessary to test the bio- 
inspired algorithms using other configuration parameters; (c) the fitness reached with the 
sequential algorithms SFS and SFFS are similar and, therefore, we can suggest that the 
nesting effect associated with the SFS algorithm is not significant in the considered fitness 
functions. 

Figure 9 shows the comparison of the performances reached when the patterns were 
tuned with the different transformation matrices W\ (i = 1,2;/ = 1, . . .,4). The results 
indicate the following aspects: (a) the classification rates are similar to the previous ones 
published in [5] and the classification error of the conventional myoelectric control archi- 
tecture is superior to the classification error of the myoelectric control architecture with 
iPCA tuning; (b) there are similar levels of classification error when transformation matri- 
ces W£ were used (i.e. transformation matrices computed with the i fitness function and 
each of the search algorithms); and (c) there are differences between classification errors 
when transformation matrix W% was used (i.e. transformation matrices computed with 
the / search algorithm and each of the fitness functions). 

The response presented in (a) validates the superiority of the iPCA tuned architec- 
ture over the conventional myoelectric control architecture. Otherwise, the (b) response 
was not expected. Due to the superiority of fitness that was computed with sequen- 
tial algorithms (see Figure 8), greater differences between the classification errors were 
expected. For instance, that classification errors associated to transformation matrices 
computed with sequential algorithms (W% with / = 1,2) were less than classifica- 
tion errors associated to transformation matrices computed with bio-inspired algorithms 
(W% with / = 3,4). This suggests that finding of the minimal value for the fitness func- 
tion is not a sufficient condition to guarantee minimal classification errors during the 
evaluation of the myoelectric control system. 

Finally, the (c) response was expected, basically by two reasons: (/) the superiority of 
classification error over correlation factor in determining the discriminant information in 
the selected subsets, (ii) The superior fitness of computed solutions using systems with 
classification error fitness function (see Figure 8). 

The confusion matrices displayed in Additional file 1: Table S5, Additional file 2: Table 
S6, Additional file 3: Table S7 and Additional file 4: Table S8 provide a direct compari- 
son between the optimization alternatives used. Additional file 1: Table S5 and Additional 
file 3: Table S7 compare classification accuracy when the transformation matrices W% 
were used (i.e. the transformation matrices computed with the correlation factor fit- 
ness function and the sequential and bio-inspired algorithms). Additional file 2: Table S6 
and Additional file 4: Table S8 compare classification accuracy when the transformation 
matrices W% were used (i.e. the transformation matrices computed with the classification 
error fitness function and the sequential and bio-inspired algorithms). 

In Additional file 1: Table S5 and Additional file 3: Table S7, the values in white (left 
columns) show processing with SFS algorithm and the values in gray (right columns) 
show the results with SFFS algorithm. In Additional file 2: Table S6 and Additional file 4: 
Table S8, the values in white (left columns) show processing with PSO algorithm and the 
values in gray (right columns) show the results with ABC algorithm. 

The results along the main diagonal are correct classifications (accuracy) and those 
lying outside of the main diagonal are incorrect classifications. Empty cells correspond 
to an error of 0% and the accuracies were rounded to the nearest tenth of a percent. The 
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values on these tables confirm two trends: (a) similar levels on classification error when 
transformation matrices W£ were used and (b) differences between classification errors 
when transformation matrices W% were used. 

Conclusions 

This paper has presented and evaluated the use of bio-inspired optimization algo- 
rithms in the iPCA stage of Myoelectric Control Systems for hand prosthesis. The 
influence of fitness function and searching algorithms on myoelectric control sys- 
tems with iPCA tuning were investigated in terms of optimization performance 
(e.g. running time, number of iterations, mean search time), solution fitness and clas- 
sification performance. The alternatives considered for fitness functions were the fol- 
lowing: classification error, correlation factor, and for search algorithms: SFS, SFFS, 
PSO and ABC. The experimental results suggest superiority on classification perfor- 
mance when reduced iPCA matrices computed with classification error fitness function 
were used. The results have also shown the independence of classification perfor- 
mance with regard to the search algorithm. However, a practical advantage was found 
in using PSO algorithm during the optimal search. This advantage is related to the 
computational time of the process during the parameter configuration stage and was 
corroborated for a particular set of configuration parameters in the algorithm. Future 
studies will be carried out to investigate other configuration parameters in the PSO 
algorithm as well as the effects of the selected subset length N\ on the classification 
performance. 

Nomenclature 



MCSs 


Myoelectric Control Systems 


iPCA 


individual principal component analysis 


PCA 


principal component analysis 


ABC 


artificial bee colony 


PSO 


particle swarm optimization 


EMG 


electromyographic signals 


SFS 


sequential forward selection 


SFFS 


sequential floating forward selection 


MLP 


multilayer perceptron 


NNs 


neural networks 


LDA 


linear discriminant analysis 


ULDA 


uncorrelated linear discriminant analysis 


GMMs 


Gaussian mixture models 


HMMs 


hidden Markov models 


SVMs 


support vector machines 


TD 


Time Domain 


MAV 


mean absolute value 


MAVS 


mean absolute value slope 


ZC 


zero crossings 


SSC 


slope sign changes 


WL 


wave length 


AR 


autoregressive 
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STFT 
WT 
WPT 
TDAR 

x 

y 
y 
h 

Xy 

Xtr 
Xtst 

Xi 



Pbesti 

gfbest 
w 

gsbest 
r\ y r2 and 

c\ 

C2 and C3 
Pi 

Cmax 

S 
M 

O' 

o 



C x 
R x 
F c 

j 
i 

C 
N 



small time Fourier transform 

Wavelet transform 

Wavelet packets transform 

concatenated TD and AR features 

generic EMG signal 

vector of classes of movement 

vector of predicted classes of movement 

vector of predicted classes of movement computed with W\ 

classification error of the system configured with the W\ transformation 

matrix 

validation data set of EMG signals 
training data set of EMG signals 
test data set of EMG signals 

position vector of the i particle and position vector of the i food source in 

the PSO algorithm and ABC algorithm respectively 

velocity vector of the i particle and new position vector around X{ in the 

PSO algorithm and ABC algorithm respectively 

best individual position for the i particle in the PSO algorithm 

first best position explored so far in the PSO algorithm 

inertial weight in the PSO algorithm 

second best position explored so far in the PSO algorithm 

random numbers uniformly distributed in the range (0, 1) 

cognitive parameter in the PSO algorithm 

social parameters in the PSO algorithm 

probability factor of each employed bee in the ABC algorithm 

limit of cycles to improve a solution in the ABC algorithm 

maximum number of iterations and maximum number of cycles in the 

PSO algorithm and ABC algorithm respectively 

random number in the range [ — 1, 1] that controls the production of 

food source positions around X{ in the ABC algorithm 

iPCA projected pattern from x 

number of observations of the EMG signal 

vector of possible optimal set of channels 

vector of optimal set of channels 

vector of optimal set of channels computed with the i fitness function 

and the ; search algorithm 

minimum value of fitness function 

cross correlation factor 

correlation coefficient matrix 

normalized correlation factor 

index for the search algorithm 

index for the fitness function 

number of classes of movement 

length of the pattern vector 

length of the reduced size vector 
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S the number of particles and bees used during optimal search with 

bio-inspired algorithms 
W PCA transformation matrix 

W c PCA transformation matrix for the c movement class 
WiPCA iPCA transformation matrix 

Wr reduced iPCA transformation matrix 

W\ reduced iPCA transformation matrix computed with the i fitness 
function and the ; search algorithm 



Additional files 



Additional file 1 : Table S5 Confusion matrix for the sequential algorithms and correlation factor fitness 
function. 

Additional file 2: Table S6 Confusion matrix for the bio-inspired algorithms and correlation factor fitness 
function. 

Additional file 3: Table S7 Confusion matrix for the sequential algorithms and classification error fitness 
function. 

Additional file 4: Table S8 Confusion matrix for the bio-inspired algorithms and classification error fitness 
function. 
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